1 ABSTRACT: Today, accounting systems data are placed in databases, allowing users to 

2 query data without going through the preprogrammed accounting system or learning programming 

3 in order to get at the data. There has been a great deal of experimentation with different ways to 

4 accomplish the database approach to business processing, and 3 approaches have turned out to be 

5 the most common: 1. the hierarchical or tree structure, 2. the network or plex structure, and 3. 

6 the relational or table structure. The relational approach is based on tables of data in rows and 

7 columns, with operations defined on those tables. Among other things, the relational database 

8 allows relationships between tables to be created later, after the data tables have been developed 

9 and the data entered. As accounting data processing moves away from centralized mainframe 

10 processing, it moves toward either decentralized processing, with totally separate databases, or 

11 distributed database systems. 

12 TEXT: All information begins as data. The only thing more important than that is how the 

13 data are organized when they are stored. 

14 Every day at General Motors 1,183 mainframes process 17 million transactions and borrow 

15 $1.7 billion from 700 banks. The IRS receives more than 500 million informational returns each 

16 year. VISA has 77.2 million cards generating $60.6 billion dollars in charges, with 25 million 

17 cards used in 1,564 automated teller machines in 25 states. 

18 As you can see, accounting data are at the heart of any company's information system, 

19 regardless of the level of computer sophistication. Yet, until recently, only trained computer 

20 professionals could access computerized data. Users could not access the data directly, so the data 

21 were not as useful as they could have been. 

22 With the traditional data access approach, queries were difficult. A separate computer 

23 program was required for every type of analysis, and it was hard to get access to data for purposes 

24 other than those planned for originally and thus preprogrammed into the accounting system. Now 

25 accounting systems data are placed in a database. Accounting programs, such as transaction 

26 processing and financial reporting, remain much as they were, but the database is accessible 

27 directly with tools that the end user can handle. With this approach, virtually anyone can query 

28 the database. The user does not have to go through the preprogrammed accounting system or learn 

29 programming in order to get to the data. 
30 BENEFITS AND COSTS OF THE DATABASE APPROACH 

31 Problems with retrieving data in both batch and interactive processing systems using the 

32 traditional file approach led to the basic concept behind the database. With the database there is 

33 one set of uniquely defined data items, and all computer applications use the same data items that 

34 are separate from the applications that use them. This setup allows analysis of the same data across 

35 applications. It also means that the applications and the data can be changed independent of each 

36 other, so data can be added to, modified, or deleted from the database without the programs using 

37 them being affected. 

38 For example, a company may have one set of prices for materials used by inventory 

39 control for costing issues, another set of prices in the engineering department used for design of 

40 new or revised products, and still another set of prices used by the purchasing department for 

41 determining sources. These different sets of prices are updated at different times by different 

42 people from different information. Needless to say, the prices probably rarely agree, even though 

43 they presumably represent the same thing. The database approach to this problem is to have one 

44 set of prices for materials and then have each application use the same information. 

45 The database approach also has simplified applications development. All file systems have 

46 the same basic components for file creation, maintenance, transaction processing, and report 

47 writing. Once the applications are separated from the data, these programs can be developed just 

48 once for all application data. Previously, there had been much duplication of effort in the 

49 development of these programs because they were tied to the specific files they used. 

50 But the database approach is not without its costs. The main cost involves coordination. 

51 If the same number will uniquely identify a supplier in the ordering system, the accounts payable 

52 system, and the inventory system, someone or some group must coordinate the design of these 

53 systems. The business cannot allow separate groups to develop systems independently. But the 

54 price of coordination can be higher than a company is willing to pay. Also, because each system 

55 is not designed unto itself, certain compromises must be accepted in individual components so that 

56 the total system will fit together. As a result, each component may not be optimal for a particular 

57 task, frequently a concern of users who are more interested in optimizing one specific subsystem, 

58 such as inventory, than in optimizing the overall company's operations. 

59 RELATIONAL DATABASES 

60 Businesses use data items, records, and files to keep track of their operations and 

61 accounting data. The most useful way to visualize data items and records is to see them as a table 

62 of information called a flat file. The term flat file is used because the information can be viewed 

63 in two dimensions: rows (records) and columns (data items) similar to tables of data in a book. 

64 Table 1A has a row for each (customer) record in the file and a column for each of the four data 

65 items. For convenience, the name of each data item, such as customer number, is at the top of the 

66 appropriate column. 

67 For a flat file to be able to store and analyze data, it must have the following 

68 characteristics: 

69 1. All items in each column must be the same kind of data, such as a customer number, 

70 a customer name, or a customer address. 

71 2. Each column must have its own unique name, separate from all others. In this case, the 
72 names are cust-no, cust-name, cust-addr1, and cust-addr2. 

73 3. All rows must be different in at least one data item from every other row. In other 

74 words, two rows of data cannot be exactly alike. If two rows are alike, either they refer to the 

75 same customer, so the duplicate can be eliminated, or they refer to two separate customers, in 

76 which case there must be one or more data items to distinguish between those customers. 

77 4. Every cell (the intersection of a row and a column) contains only one data item. Thus 

78 every customer has exactly one cust-no, one cust-name, and so on. 
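
The four characteristics above can be checked mechanically. The following is a minimal sketch in Python, assuming a table is held as a list of column names plus a list of rows; the function name check_flat_file and the sample customer values are hypothetical and are not taken from Table 1.

```python
def check_flat_file(columns, rows):
    """Return a list of violations of the four flat-file characteristics."""
    problems = []

    # Characteristic 2: every column has its own unique name.
    if len(set(columns)) != len(columns):
        problems.append("column names are not unique")

    # Characteristic 1: all items in a column are the same kind of data.
    for c, name in enumerate(columns):
        kinds = {type(row[c]) for row in rows}
        if len(kinds) > 1:
            problems.append(f"column {name} mixes kinds of data")

    # Characteristic 3: no two rows are exactly alike.
    seen = []
    for row in rows:
        if row in seen:
            problems.append(f"duplicate row {row}")
        seen.append(row)

    # Characteristic 4: every cell holds exactly one data item.
    for row in rows:
        for name, cell in zip(columns, row):
            if isinstance(cell, (list, tuple, set, dict)):
                problems.append(f"column {name} holds a repeating group")

    return problems


# Hypothetical sample data using the column names from the text.
columns = ["cust-no", "cust-name", "cust-addr1", "cust-addr2"]
customers = [
    [101, "Acme Supply", "12 Main St", "Dayton OH"],
    [102, "Baker Tools", "9 Elm Ave", "Toledo OH"],
]
with_duplicate = customers + [[102, "Baker Tools", "9 Elm Ave", "Toledo OH"]]

print(check_flat_file(columns, customers))
print(check_flat_file(columns, with_duplicate))
```

The first call reports no problems; the second reports the repeated row for customer 102.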

79 Having one and only one data item per cell is significant to the design and number of flat 

80 files. Suppose a manager is interested in all invoices for a particular customer. Some customers 

81 will have only one invoice while others will have two, four, or more. These invoices would not, 

82 therefore, fit into a flat file without some modification because each cell must contain only one 

83 item. A situation such as multiple invoices for a customer is called a repeating group because there 

84 potentially is more than one data item for a customer. Fortunately, repeating groups can be dealt 

85 with by forming two flat files. The process of forming additional flat files from the repeating 

86 groups is called normalization. 
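
As an illustration of removing a repeating group, the sketch below splits customer records that carry a list of invoices into two flat files. It is only one simple way to do this, and the field names inv-no and amount, the function name normalize, and the sample values are assumptions made for the example.

```python
# Hypothetical unnormalized records: each customer carries a repeating
# group of (invoice number, amount) pairs.
unnormalized = [
    {"cust-no": 101, "cust-name": "Acme Supply",
     "invoices": [(5001, 250.00), (5002, 75.50)]},
    {"cust-no": 102, "cust-name": "Baker Tools",
     "invoices": [(5003, 120.00)]},
]


def normalize(records):
    """Split records with a repeating invoice group into two flat files."""
    customers, invoices = [], []
    for rec in records:
        # One row per customer, with exactly one item per cell.
        customers.append({"cust-no": rec["cust-no"],
                          "cust-name": rec["cust-name"]})
        # One row per invoice; cust-no is repeated to keep the link.
        for inv_no, amount in rec["invoices"]:
            invoices.append({"inv-no": inv_no,
                             "cust-no": rec["cust-no"],
                             "amount": amount})
    return customers, invoices


customers, invoices = normalize(unnormalized)
print(len(customers), "customer rows;", len(invoices), "invoice rows")
```

Each invoice row repeats the customer number, which is what lets the two files be related again later.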

87 Another name for a flat file is a relation because it represents a relationship among the 

88 various data items of the file. In Table 1C, the relation is that all the data items in one record 

89 represent one customer. On the other hand, there are two relationships shown in Table 1, one of 

90 customers (A) and one of invoices (B). In B, there is one record for each invoice. If there are 

91 three invoices for each of six customers, there would be 18 invoices and, thus, 18 records in the 

92 invoice file and 18 rows in the table. But there would be only six customer records in the customer 

93 relation (flat file). 

94 Some database software can process only flat files, but because it is possible to transform 

95 any repeating group into a series of flat files, this restriction is not serious. Databases that are 

96 based on flat files and relations are called relational databases. This concept of transforming 

97 repeating groups into flat files does create duplication of data. In the invoice example, the 

98 customer number is repeated in both the customer file and the invoice file. But relational databases 

99 are easy to use and provide flexibility in handling the data, which in most applications outweighs 

100 this repetition of data. 
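
The six-customer, 18-invoice example can be sketched with Python's built-in sqlite3 module, used here only as a convenient stand-in for a relational database. The table and column names are assumptions (underscores replace the hyphens of cust-no because hyphens are awkward in SQL names), and the amounts are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT)")
conn.execute(
    "CREATE TABLE invoice (inv_no INTEGER PRIMARY KEY, cust_no INTEGER, amount REAL)"
)

# Six customers with three invoices each.
for cust_no in range(1, 7):
    conn.execute("INSERT INTO customer VALUES (?, ?)", (cust_no, f"Customer {cust_no}"))
    for k in range(3):
        conn.execute(
            "INSERT INTO invoice (cust_no, amount) VALUES (?, ?)",
            (cust_no, 100.0 * (k + 1)),
        )

customer_rows = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
invoice_rows = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]

# The repeated customer number is what relates the two flat files.
joined = conn.execute(
    "SELECT c.cust_name, i.inv_no, i.amount "
    "FROM customer c JOIN invoice i ON i.cust_no = c.cust_no "
    "WHERE c.cust_no = ?",
    (3,),
).fetchall()

print(customer_rows, invoice_rows, len(joined))
```

The customer table holds six rows, the invoice table 18, and the join returns the three invoices of the chosen customer.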

101 RELATIONAL DATABASE MANAGEMENT SYSTEMS (RDBMS) 

102 (Table Omitted) Captioned as: Table 1 

103 There has been a great deal of experimentation with different ways to accomplish the 

104 database approach to business processing. Three approaches have turned out to be the most 

105 common: the hierarchical or tree structure, the network or plex structure, and the relational or 

106 table structure. 

107 Hierarchical or tree approach. The first major database was developed at Rockwell 

108 International for the purpose of tracking the development of the Apollo space program. The 

109 resulting computer program later became known as IMS (Information Management System) when 

110 it was sold by IBM. This database had a hierarchical focus: The product (spacecraft) was 

111 composed of subassemblies and parts, and each subassembly was composed of further 
112 subassemblies and parts. Eventually, all subassemblies were broken down into their component 

113 parts. A complete breakdown of a product into its component parts often is called a bill of 

114 materials. 

115 A hierarchical file, such as a bill of materials detailing the components of a manufactured 

116 product, has a tree structure relationship between the records of the file. (See Figure 1.) A tree 

117 is composed of a hierarchy of elements called nodes. The uppermost level of the hierarchy has 

118 only one node, called the root. In our example, this root would correspond to the finished 

119 spacecraft. With the exception of the root, every node has another node related to it at a higher 

120 level, called its parent. No element can have more than one parent. Each parent can have one or 

121 more elements related to it at a lower level, called children; they would be the subassemblies and 

122 the parts that compose them. Elements with no nodes in the next level down are called leaves, 

123 which would be individual parts with no assembly. Therefore, each node (component) has only 

124 one parent (the component it goes into), the root has no parent (because it is complete), and leaves 

125 have no children (because they are not assembled). Note that if you look at each element in Level 

126 2 of Figure 1 and think of it as a root, then its children and descendants form a tree. A master file 

127 transaction structure also can be thought of as a hierarchical or tree file structure. 
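
The tree vocabulary above (root, parent, children, leaves) can be sketched as a small Python class. This is an illustration only, not how IMS stores data; the class name Node and the subassembly and part names are hypothetical.

```python
class Node:
    """One element of a tree: at most one parent, any number of children."""

    def __init__(self, name):
        self.name = name
        self.parent = None   # the root keeps None here
        self.children = []

    def add(self, child):
        # No element can have more than one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def is_leaf(self):
        return not self.children


def leaves(node):
    """Collect the names of all leaves below (or at) the given node."""
    if node.is_leaf():
        return [node.name]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Hypothetical bill of materials for the spacecraft example.
root = Node("spacecraft")
engine = root.add(Node("engine subassembly"))
cabin = root.add(Node("cabin subassembly"))
engine.add(Node("nozzle"))
engine.add(Node("pump"))
cabin.add(Node("hatch"))

print(leaves(root))    # all individual parts
print(leaves(engine))  # a Level 2 node treated as the root of its own tree
```

Calling leaves on the engine subassembly shows the point made in the text: any node, taken together with its descendants, is itself a tree.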

128 The hierarchical approach has the advantage of extremely fast transaction processing. The 

129 disadvantages of this approach are that it is extremely complex to set up, often requiring months 

130 for the initial project, and it is very difficult to maintain and change as circumstances and data 

131 change. Consequently, the hierarchical approach and the IMS program are suitable only for highly 

132 structured and extremely high-volume transaction processing environments. 

133 Network or plex approach. The network approach (or plex structure) exists somewhere 

134 between the hierarchical and relational approaches in both speed and ease of use and has fallen 

135 somewhat out of favor. Users requiring speed choose the hierarchical approach, while those 

136 desiring ease of use choose the relational approach. As a result, there is virtually no new 

137 development of network DBMSs or new applications using a network DBMS. 

138 If a child in a data relationship has more than one parent, the structure is a network or plex 

139 structure. As in tree structures, plex structures may have levels. Figure 2 shows the network 

140 structure of a purchasing system with five record types. Each relationship is a parent-child 

141 relationship. The purchase order record type is a child of the part (that is, inventory item) record 

142 type and a parent of the purchase item record type. A more complex structure is a one-to-many 

143 relationship in both directions between part and purchase order. Each part (inventory item) can 

144 be purchased using many different purchase orders, and each purchase order can be used for many 

145 different parts. 
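
A plex structure differs from a tree in that a child may have more than one parent. The sketch below is a simplified illustration in Python: because Figure 2 is not reproduced here, it assumes that each purchase item record is a child of both a part record and a purchase order record. The class name Record, the helper names, and the sample keys are hypothetical.

```python
class Record:
    """A record that, unlike a tree node, may have several parents."""

    def __init__(self, kind, key):
        self.kind = kind
        self.key = key
        self.parents = []
        self.children = []


def link(parent, child):
    """Create one parent-child relationship (a pointer in each direction)."""
    parent.children.append(child)
    child.parents.append(parent)


bolt = Record("part", "bolt")
nut = Record("part", "nut")
po1 = Record("purchase order", "PO-1")
po2 = Record("purchase order", "PO-2")

# Each purchase item has two parents: the part and the purchase order.
for part, order in [(bolt, po1), (nut, po1), (bolt, po2)]:
    item = Record("purchase item", f"{order.key}/{part.key}")
    link(part, item)
    link(order, item)


def related(record, kind):
    """Follow the pointers down to the children and up to their other parents."""
    return sorted({p.key for item in record.children
                   for p in item.parents if p.kind == kind})


print(related(bolt, "purchase order"))  # orders that buy this part
print(related(po1, "part"))             # parts bought on this order
```

Both lookups work only because the links were built in advance, which mirrors the point that a network database answers efficiently the queries designed into it.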

146 The network or plex structure approach is easier to use (although slower) than the rigid tree 

147 or hierarchical approach, and it is the basis for the earlier market success of non-IBM systems 

148 such as the Integrated Database Management System (IDMS). The major difficulty with the network 

149 approach is that the queries that the system can answer efficiently must be designed into the 

150 system. Queries, however, often arise that have not been planned for. 

151 Relational approach. In 1969 at IBM, mathematician E.F. Codd developed the relational 

152 theory of data that he proposed as a universal foundation for database systems. Codd's theory 

153 formed the basis for all further work in this area. A relational DBMS (RDBMS) satisfies four 

154 conditions. 
155 Information-All information in the RDBMS is represented in one way only, as values in 

156 tables, which allows users to access, understand, and manipulate data more easily. Each data value 

157 should be accessible by the combination of the name of the table in which the data are stored, the 

158 name of the column under which they are stored, and the primary key that identifies the row in which 

159 they are stored. 
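
The information condition says that a table name, a column name, and a primary key value are enough to reach any single data value. The following sketch illustrates that idea with Python's built-in sqlite3 module; the helper get_value, the table layout, and the sample row are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT, cust_addr1 TEXT)"
)
conn.execute("INSERT INTO customer VALUES (101, 'Acme Supply', '12 Main St')")


def get_value(conn, table, column, key_column, key):
    """Fetch one data value by table name, column name, and primary key."""
    # Names cannot be passed as SQL parameters, so check them against
    # the database's own catalog before building the statement.
    tables = {r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in tables:
        raise ValueError(f"unknown table {table}")
    columns = {r[1] for r in conn.execute(f"PRAGMA table_info({table})")}
    if column not in columns or key_column not in columns:
        raise ValueError("unknown column")
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE {key_column} = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]


print(get_value(conn, "customer", "cust_name", "cust_no", 101))
```

No pointer or physical record address is involved; the three names and the key value are the whole access path from the user's point of view.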

160 (Chart Omitted) Captioned as: Figure 1. 

161 (Chart Omitted) Captioned as: Figure 2. 

162 Relational language-The RDBMS requires a data language to define data; define logical 

163 displays of data; manipulate data; establish rules to prevent errors, such as acceptable values for 

164 codes or a required connection between a master file and a transaction file; and maintain 

165 authorization of those users able to access the data. This language must be able to process entire 

166 tables (not row-by-row) for queries and also for data modification, insertion, and deletion. 
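
The kinds of rules mentioned above can be sketched in SQL, run here through Python's built-in sqlite3 module. This is an illustration under assumptions: the status codes 'A' and 'I', the table layout, and the sample rows are hypothetical, and SQLite enforces the master-to-transaction link only after foreign keys are switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Rule 1: acceptable values for a code (status must be 'A' or 'I').
conn.execute(
    "CREATE TABLE customer ("
    " cust_no INTEGER PRIMARY KEY,"
    " status TEXT CHECK (status IN ('A', 'I')))"
)
# Rule 2: every transaction row must link to an existing master row.
conn.execute(
    "CREATE TABLE invoice ("
    " inv_no INTEGER PRIMARY KEY,"
    " cust_no INTEGER NOT NULL REFERENCES customer(cust_no),"
    " amount REAL)"
)
conn.execute("INSERT INTO customer VALUES (101, 'A')")
conn.execute("INSERT INTO invoice VALUES (5001, 101, 250.0)")
conn.execute("INSERT INTO invoice VALUES (5002, 101, 100.0)")

rejected = []
for sql in (
    "INSERT INTO customer VALUES (102, 'X')",       # invalid status code
    "INSERT INTO invoice VALUES (5003, 999, 10.0)",  # no such customer
):
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        rejected.append(sql)

# One statement modifies the entire table, not one row at a time.
conn.execute("UPDATE invoice SET amount = amount * 2")
amounts = [r[0] for r in conn.execute("SELECT amount FROM invoice ORDER BY inv_no")]

print(len(rejected), amounts)
```

Both bad inserts are refused by the database itself, so no application program has to repeat the checks.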

167 Independence-Users must not be required to modify queries or application programs if the 

168 database has been reorganized and there is no loss of information in the base tables. This would 

169 include instances where data are moved (distributed) from one computer to another to be closer 

170 to their source or to the location where they are used more often. Integrity management, that is, 

171 the maintenance of required links between tables and the validity of values for the data items, 

172 should not be duplicated in each application but should be implemented by the DBMS. 

173 Views-The RDBMS must be able to create logical tables (called views) from the base 

174 tables. For example, the information about an individual that a payroll clerk is authorized to look 

175 at could be put into a view that the payroll clerk would be able to access. The view could be 

176 queried and processed just as a base table would be, even though the original data are not repeated 

177 in the view and all of the data are stored only once in the base table. 
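
The payroll clerk example can be sketched with a SQL view, run here through Python's built-in sqlite3 module. The employee table, its columns, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY, name TEXT, pay_rate REAL, medical_notes TEXT)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Pat Lee', 21.5, 'confidential')")

# The view exposes only the columns the payroll clerk may see.
conn.execute(
    "CREATE VIEW payroll_view AS SELECT emp_no, name, pay_rate FROM employee"
)

# A change to the base table shows up in the view, because the view
# stores no data of its own.
conn.execute("UPDATE employee SET pay_rate = 23.0 WHERE emp_no = 1")
row = conn.execute("SELECT * FROM payroll_view WHERE emp_no = 1").fetchone()
view_columns = [d[0] for d in conn.execute("SELECT * FROM payroll_view").description]

print(view_columns, row)
```

The view is queried exactly like a base table, yet the medical_notes column never appears in it.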

178 ADVANTAGES OF THE RELATIONAL APPROACH 

179 The relational approach is based on tables of data in rows and columns, with operations 

180 defined on those tables. Yet these tables must possess the four characteristics of the relational 

181 approach described above. The RDBMS (not the user) must ensure that all database tables comply 

182 with these requirements. When they do, the RDBMS is able to apply mathematical operations and 

183 strict logic to them, which eliminates traditional deficiencies of DBMSs and offers significant 

184 practical benefits. The table structure of an RDBMS is simple and familiar. It is general enough 

185 to represent most types of data, is independent of any internal computer mechanisms, and it is 

186 flexible because the user can restructure tables. Transaction processing is slower than with other 

187 approaches, but modifying the structure of files and adding data items (columns) is considerably 

188 easier. Also, the relational approach allows relationships between tables to be created later, after 

189 the data tables have been developed and the data entered. In the hierarchical and network 

190 approaches, allowable queries about the data have to be identified before the database is developed 

191 so that the pointers between files and records can be created along with the database. 

192 Data manipulation by an RDBMS is managed by a well-defined, complete set of 

193 mathematical operations, which always yield tables as results. With relational operations, data 
194 access no longer needs to be procedural. The user can specify a data request by giving the 

195 operations that must be performed on other tables to derive it. The system translates these requests 

196 into sets of efficient processing steps. A relational DBMS can accumulate information about the 

197 database (such as statistics) in a catalog 

198 to optimize these operations. 
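
The idea that relational operations take tables and always yield tables can be sketched in a few lines of Python. This is a toy illustration, not how an RDBMS is implemented: tables are lists of dictionaries, and the function names select, project, and join, as well as the sample data, are assumptions made for the example.

```python
def select(table, predicate):
    """Keep the rows that satisfy a condition; the result is a table."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Keep only the named columns, dropping duplicate rows."""
    result = []
    for row in table:
        slim = {c: row[c] for c in columns}
        if slim not in result:
            result.append(slim)
    return result


def join(left, right, column):
    """Combine rows of two tables that agree on a shared column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


# Hypothetical customer and invoice tables.
customers = [
    {"cust-no": 101, "cust-name": "Acme Supply"},
    {"cust-no": 102, "cust-name": "Baker Tools"},
]
invoices = [
    {"inv-no": 5001, "cust-no": 101, "amount": 250.0},
    {"inv-no": 5002, "cust-no": 101, "amount": 75.5},
    {"inv-no": 5003, "cust-no": 102, "amount": 120.0},
]

# Because every operation returns a table, operations can be nested freely.
big_spenders = project(
    select(join(customers, invoices, "cust-no"), lambda r: r["amount"] > 100),
    ["cust-name"],
)
print(big_spenders)
```

The request names what is wanted (customers with an invoice over 100) rather than spelling out how to read each file record by record.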

199 SYSTEMS AND TRENDS 

200 (Photograph Omitted) Captioned as: Relational databases translate requests for 

201 information into logical processing steps. 

202 Leading relational programs for mainframes are DB2 for the MVS operating system and 

203 SQL/DS for the VM/CMS operating system. An SQL/DS application, for example, can have up 

204 to 70 million rows, hundreds of tables, and thousands of columns. SQL/DS is installed on more 

205 than 7,500 mainframes and costs more than $100,000 per installation. Other leading relational 

206 databases are Oracle and INGRES for minicomputers and dBASE and Paradox for microcomputers. 

207 The mathematical and logical basis of the relational approach makes it a natural candidate 

208 for a database standard. A standard based on the relational model would yield the best of all 

209 worlds: The products that complied would offer both relational features and compatibility with 

210 a defined standard, and the underlying database functions would be the same for all products, 

211 regardless of whether they are designed for a single user on a PC or multiple users in more 

212 sophisticated systems. In addition, tools such as spreadsheets and word processors do operate on 

213 some of these databases. Both the American National Standards Institute and the International 

214 Standards Organization have developed standards, and all DBMSs are moving toward them. 

215 RDBMSs are moving toward support of distributed databases, which are databases spread 

216 throughout the computer systems in a network. One benefit of a distributed database is that local 

217 data can be retrieved without any network activity, thus reducing communications costs when 

218 compared with a centralized database at a remote site. Another potential advantage is that each 

219 database can be sized appropriately for its amount of data, the complexity of user requirements, 

220 and the number of users. As the system grows, added demand can be met more easily than with 

221 a centralized system by making smaller changes to existing databases or by adding new databases 

222 to a network. Current RDBMSs deliver these benefits by allowing a collection of database 

223 operations (called a unit of work) to retrieve and update data at a remote site. Future capabilities 

224 will add support for a distributed unit of work, which allows a user to access data at multiple 

225 locations simultaneously. 
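
A unit of work groups several database operations so that either all of them take effect or none do. The sketch below shows that all-or-nothing behavior with Python's built-in sqlite3 module, under the assumption that a local in-memory database can stand in for the remote site; the stock table, the location names, and the function move_stock are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (location TEXT PRIMARY KEY,"
    " on_hand INTEGER CHECK (on_hand >= 0))"
)
conn.executemany("INSERT INTO stock VALUES (?, ?)", [("Dayton", 10), ("Toledo", 5)])
conn.commit()


def move_stock(conn, qty):
    """Move qty units from Dayton to Toledo as one unit of work."""
    try:
        # 'with conn' commits if both updates succeed and rolls back
        # both if either one fails.
        with conn:
            conn.execute(
                "UPDATE stock SET on_hand = on_hand + ? WHERE location = 'Toledo'",
                (qty,),
            )
            conn.execute(
                "UPDATE stock SET on_hand = on_hand - ? WHERE location = 'Dayton'",
                (qty,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = move_stock(conn, 4)    # succeeds
second = move_stock(conn, 50)  # would drive Dayton below zero, so it is undone
balances = dict(conn.execute("SELECT location, on_hand FROM stock"))

print(first, second, balances)
```

After the failed second move, Toledo's balance is unchanged even though its update ran first, because the whole unit of work was rolled back.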

226 RDBMSs are moving toward providing access to the data for applications running on 

227 remote computers. This style of distributed computing is called client/server, where the computer 

228 providing access to the data is called the database server, and the remote computer requesting the 

229 data is called the client. 

230 In client/server, one branch of a company in one location may have primary contact and 

231 conduct virtually all transactions with one segment of the company's customers, while other 

232 branches work with other customers. It is more efficient from the standpoint of data storage, 

233 transaction processing, and data communications if each branch maintains the data files for its 
234 customers while allowing other branches access to the data. This approach is called distributed 

235 data processing because the databases are distributed around the operational locations of the firm. 

236 If the databases in different branches or divisions are not connected and are kept separate, the 

237 company is operating with a decentralized data processing system. 

238 There are four goals for a distributed database system: 

239 1. A distributed system should appear to each user to be a single, nondistributed system so 

240 that queries and transactions that affect distributed data look no different from local queries and 

241 transactions. 

242 2. Each location should have local autonomy and not require the approval of some centralized 

243 group for local changes. 

244 3. A central site should not be required for data storage or processing. 

245 4. Operation should be continuous. 

246 To achieve these goals, there are several features of a distributed DBMS that should be 

247 transparent and of no concern to users. Table 2 summarizes them. 

248 As accounting data processing moves away from centralized mainframe processing, it 

249 moves toward either decentralized processing, with totally separate databases, or distributed 

250 database systems. Proof of the flexibility of the RDBMS is its ability to adjust to the newer, much 

251 more complicated ways of dealing with data. 

252 (Table Omitted) Captioned as: Table 2. 
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1 ABSTRACT: Today, accounting systems data are placed in databases, allowing users to 

2 query data without going through the preprogrammed accounting system or learn programming 

3 in order to get at the data. There has been a great deal of experimentation with different ways to 

4 accomplish the database approach to business processing, and 3 approaches have turned out to be 

5 the most common: 1. the hierarchical or tree structure, 2. the network or plex structure, and 3. 

6 the relational or table structure. The relational approach is based on tables of data in roles and 

7 columns, with operations defined on those tables. Among other things, the relational database 

8 allows relationships between tables to be created later, after the data tables have been developed 

9 and the data entered. As accounting data processing moves away from centralized mainframe 

10 processing it moves toward either decentralized processing, with totally separate databases, or 

1 1 distributed database systems. 

12 TEXT: All information begins as data. The only thing more important than that is how the 

13 data are organized when they are stored. 

14 Every day at General Motors 1,183 mainframes process 17 million transactions and borrow 

15 $1.7 billion from 700 banks. The IRS receives more than 500 million informational returns each 

16 year. VISA has 77.2 million cards generating $60.6 billion dollars in charges, with 25 million 

17 cards used in 1,564 automated teller machines in 25 states. 

18 As you can see, acounting data are at the heart of any company's information system, 

19 regardless of the level of computer sophistication. Yet, until recently, only trained computer 

20 professionals could access computerized data. Users could not access the data directly, so they 

21 were not as useful as they could have been. 

22 With the traditional data access approach, queries were difficult. A separate computer 

23 program was required for every type of analysis, and it was hard to get access to data for purposes 

24 other than those planned for originally and thus preprogrammed into the a ccounting system . Now 

25 accounting systems data are placed in a database. Accounting programs, such as transaction 

26 processing and financial reporting, remain much as they were, but the database is accessible 

27 directly with tools that the end user can handle. With this approach, virtually anyone can query 

28 the database. The user does not have to go through the preprogrammed accounting system or learn 

29 programming in order to get to the data. 
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30 BENEFITS AND COSTS OF THE DATABASE APPROACH 

31 Problems with retrieving data in both batch and interactive processing systems using the 

32 traditional file approach led to the basic concept behind the database. With the database there is 

33 one set of uniquely defined data items, and all computer applications use the same data items that 

34 are separate from the applications that use them. This setup allows analysis of the same data across 

35 applications. It also means that the applications and the data can be changed independent of each 

36 other, so data can be added to, modified, or deleted from the database without the programs using 

37 them being affected. 

38 For example, a company may have one set of prices for materials used by inventory 

39 control for costing issues, another set of prices in the engineering department used for design of 

40 new or revised products, and still another set of prices used by the purchasing department for 

41 determining sources. These different sets of prices are updated at different times by different 

42 people from different information. Needless to say, the prices probably rarely agree, even though 

43 they presumably represent the same thing. The database approach to this problem is to have one 

44 set of prices for materials and then have each application use the same information. 

45 The database approach also has simplified applications development. All file systems have 

46 the same basic components for file creation, maintenance, transaction processing, and report 

47 writing. Once the applications are separated from the data, these programs can be developed just 

48 once for all application data. Previously, there had been much duplication of effort in the 

49 development of these programs because they were tied to the specific files they used. 

50 But the database approach is not without its costs. The main cost involves coordination. 

51 If the same number will uniquely identify a supplier in the ordering system, the accounts payable 

52 system, and the inventory system, someone or some group must coordinate the design of these 

53 systems. The business cannot allow separate groups to develop systems independently. But the 

54 price of coordination can be higher than a company is willing to pay. Also, because each system 

55 is not designed unto itself, certain compromises must be accepted in individual components so that 

56 the total system will fit together. As a result, each component may not be optimal for a particular 

57 task, frequently a concern of users who are more interested in optimizing one specific subsystem, 

58 such as inventory, than in optimizing the overall company's operations. 

59 RELATIONAL DATABASES 

60 Businesses use data items, records, and files to keep track of their operations and 

61 accounting data. The most useful way to visualize data items and records is to see them as a table 

62 of information called a flat file. The term flat file is used because the information can be viewed 

63 in two dimensions: rows (records) and columns (data items) similar to tables of data in a book. 

64 Table 1A has a row for each (customer) record in the file and a column for each of the four data 

65 items. For convenience, the name of each data item, such as customer number, is at the top of the 

66 appropriate column. 

67 For a flat file to be able to store and analyze data, it must have the following 

68 characteristics: 

69 1. All items in each column must be the same kind of data, such as a customer number, 

70 a customer name, or a customer address. 

71 2. Each column must have its own unique name, separate from all others. In this case, the 
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72 names are cust-no, cust-name, cust-addrl, and cust-addr2. 

73 3. All rows must be different in at least one data item from every other row. In other 

74 words, two rows of data cannot be exactly alike. If two rows are alike, either they refer to the 

75 same customer, so the duplicate can be eliminated, or they refer to two separate customers, in 

76 which case there must be one or more data items to distinguish between those customers. 

77 4. Every cell (the intersection of a row and a column) contains only one data item. Thus 

78 every customer has exactly one cust-no, one cust-name, and so on. 

79 Having one and only one data item per cell is significant to the design and number of flat 

80 files. Suppose a manager is interested in all invoices for a particular customer. Some customers 

81 will have only one invoice while others will have two, four, or more. These invoices would not, 

82 therefore, fit into a flat file without some modification because each cell must contain only one 

83 item. A situation such as multiple invoices for a customer is called a repeating group because there 

84 potentially is more than one data item for a customer. Fortunately, repeating groups can be dealt 

85 with by forming two flat files. The process of forming additional flat files from the repeating 

86 groups is called normalization. 

87 Another name for a flat file is a relation because it represents a relationship among the 

88 various data items of the file. In Table 1C, the relation is that all the data items in one record 

89 represent one customer. On the other hand, there are two relationships shown in Table 1, one of 

90 customers (A) and one of invoices (B). In B, there is one record for each invoice. If there are 

91 three invoices for each of six customers, there would be 18 invoices and, thus, 18 records in the 

92 invoice file and 18 rows in the table. But there would be only six customer records in the customer 

93 relation (flat file). 

94 Some database software can process only flat files, but because it is possible to transform 

95 any repeating group into a series of flat files, this restriction is not serious. Databases that are 

96 based on flat files and relations are called relational databases. This concept of transforming 

97 repeating groups into flat files does create duplication of data. In the invoice example, the 

98 customer number is repeated in both the customer file and the invoice file. But relational databases 

99 are easy to use and provide flexibility in handling the data, which in most applications outweighs 

100 this repetition of data. 

101 RELATIONAL DATABASE MANAGEMENT SYSTEMS (RDBMS) 

102 (Table Omitted) Captioned as: Table 1 

103 There has been a great deal of experimentation with different ways to accomplish the 

104 database approach to business processing. Three approaches have turned out to be the most 

105 common: the hierarchical or tree structure, the network or plex structure, and the relational or 

106 table structure. 

107 Hierarchical or tree approach. The first major database was developed at Rockwell 

108 International for the purpose of tracking the development of the Apollo space program. The 

109 resulting computer program later became known as IMS (Information Management System) when 

110 it was sold by IBM. This database had a hierarchical focus: The product (spacecraft) was 

111 composed of subassemblies and parts, and each subassembly was composed of further 






112 subassemblies and parts. Eventually, all subassemblies were broken down into their component 

113 parts. A complete breakdown of a product into its component parts often is called a bill of 

114 materials. 

115 A hierarchical file, such as a bill of materials detailing the components of a manufactured 

116 product, has a tree structure relationship between the records of the file. (See Figure 1.) A tree 

117 is composed of a hierarchy of elements called nodes. The uppermost level of the hierarchy has 

118 only one node, called the root. In our example, this root would correspond to the finished 

119 spacecraft. With the exception of the root, every node has another node related to it at a higher 

120 level, called its parent. No element can have more than one parent. Each parent can have one or 

121 more elements related to it at a lower level, called children; they would be the subassemblies and 

122 the parts that compose them. Elements with no nodes in the next level down are called leaves, 

123 which would be individual parts with no assembly. Therefore, each node (component) has only 

124 one parent (the component it goes into), the root has no parent (because it is complete), and leaves 

125 have no children (because they are not assembled). Note that if you look at each element in Level 

126 2 of Figure 1 and think of it as a root, then its children and descendants form a tree. A master file 

127 transaction structure also can be thought of as a hierarchical or tree file structure. 
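
The tree rules above (one root, one parent per node, leaves with no children) can be sketched as a small Python class. The class name, method names, and component names are hypothetical and are not taken from IMS or from Figure 1.

```python
class Node:
    """One element of a hierarchical file, such as a bill-of-materials component."""

    def __init__(self, name):
        self.name = name
        self.parent = None   # the root keeps None: it has no parent
        self.children = []

    def add_child(self, child):
        # A tree allows at most one parent per element.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def leaves(self):
        """Return the names of all elements below this node that have no children."""
        if not self.children:
            return [self.name]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found


spacecraft = Node("spacecraft")                    # root
engine = spacecraft.add_child(Node("engine"))      # subassembly
capsule = spacecraft.add_child(Node("capsule"))    # subassembly
engine.add_child(Node("nozzle"))                   # part (leaf)
engine.add_child(Node("fuel pump"))                # part (leaf)
capsule.add_child(Node("hatch"))                   # part (leaf)
```

Calling leaves() on engine instead of spacecraft treats that subassembly as the root of its own smaller tree, which is the observation made about Level 2 of Figure 1.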

128 The hierarchical approach has the advantage of extremely fast transaction processing. The 

129 disadvantages of this approach are that it is extremely complex to set up, often requiring months 

130 for the initial project, and it is very difficult to maintain and change as circumstances and data 

131 change. Consequently, the hierarchical approach and the IMS program are suitable only for highly 

132 structured and extremely high-volume transaction processing environments. 

133 Network or plex approach. The network approach (or plex structure) exists somewhere 

134 between the hierarchical and relational approaches in both speed and ease of use and has fallen 

135 somewhat out of favor. Users requiring speed choose the hierarchical approach, while those 

136 desiring ease of use choose the relational approach. As a result, there is virtually no new 

137 development of network DBMSs or new applications using a network DBMS. 

138 If a child in a data relationship has more than one parent, the structure is a network or plex 

139 structure. As in tree structures, plex structures may have levels. Figure 2 shows the network 

140 structure of a purchasing system with five record types. Each relationship is a parent-child 

141 relationship. The purchase order record type is a child of the part (that is, inventory item) record 

142 type and a parent of the purchase item record type. A more complex structure is a one-to-many 

143 relationship in both directions (that is, a many-to-many relationship) between part and purchase order. Each part (inventory item) can 

144 be purchased using many different purchase orders, and each purchase order can be used for many 

145 different parts. 
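
The two-way relationship between parts and purchase orders can be sketched with plain Python dictionaries. This is an illustration under assumed data: the purchase item tuples and the identifiers are hypothetical and do not reproduce Figure 2.

```python
from collections import defaultdict

# Hypothetical purchase item records. Each one links a purchase order to a
# part, so each purchase item has two parents.
purchase_items = [
    ("PO-1", "bolt", 500),
    ("PO-1", "washer", 500),
    ("PO-2", "bolt", 200),
    ("PO-3", "gasket", 40),
]

orders_for_part = defaultdict(set)   # part -> purchase orders that buy it
parts_on_order = defaultdict(set)    # purchase order -> parts it covers
for po_number, part, quantity in purchase_items:
    orders_for_part[part].add(po_number)
    parts_on_order[po_number].add(part)
```

Here the part "bolt" appears on two purchase orders, and purchase order "PO-1" covers two parts, which a strict tree cannot represent.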

146 The network or plex structure approach is easier to use (although slower) than the rigid tree 

147 or hierarchical approach, and it is the basis for the earlier market success of non-IBM systems 

148 such as the Integrated Database Management System (IDMS). The major difficulty with the network 

149 approach is that the queries that the system can answer efficiently must be designed into the 

150 system. Queries, however, often arise that have not been planned for. 

151 Relational approach. In 1969 at IBM, mathematician E.F. Codd developed the relational 

152 theory of data that he proposed as a universal foundation for database systems. Codd's theory 

153 formed the basis for all further work in this area. A relational DBMS (RDBMS) satisfies four 

154 conditions. 
155 Information-All information in the RDBMS is represented in one way only, as values in 

156 tables, which allows users to access, understand, and manipulate data more easily. Each data value 

157 should be accessible by the combination of the name of the table in which the data are stored, the 

158 name of the column under which it is stored, and the primary key that identifies the row in which 

159 it is stored. 

160 (Chart Omitted) Captioned as: Figure 1. 

161 (Chart Omitted) Captioned as: Figure 2. 

162 Relational language-The RDBMS requires a data language to define data; define logical 

163 displays of data; manipulate data; establish rules to prevent errors, such as acceptable values for 

164 codes or a required connection between a master file and a transaction file; and maintain 

165 authorization of those users able to access the data. This language must be able to process entire 

166 tables (not row-by-row) for queries and also for data modification, insertion, and deletion. 

167 Independence-Users must not be required to modify queries or application programs if the 

168 database has been reorganized and there is no loss of information in the base tables. This would 

169 include instances where data are moved (distributed) from one computer to another to be closer 

170 to their source or to the location where they are used more often. Integrity management, that is, 

171 the maintenance of required links between tables and the validity of values for the data items, 

172 should not be duplicated in each application but should be implemented by the DBMS. 

173 Views-The RDBMS must be able to create logical tables (called views) from the base 

174 tables. For example, the information about an individual that a payroll clerk is authorized to look 

175 at could be put into a view that the payroll clerk would be able to access. The view could be 

176 queried and processed just as a base table would be, even though the original data are not repeated 

177 in the view and all of the data are stored only once in the base table. 
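
Several of the conditions above can be seen together in a short script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in RDBMS; the employee table, its columns, and the payroll_view name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_no        INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        pay_rate      REAL NOT NULL CHECK (pay_rate > 0),  -- rule kept by the DBMS
        medical_notes TEXT
    );
    INSERT INTO employee VALUES (1, 'Lee', 20.0, 'confidential');
    INSERT INTO employee VALUES (2, 'Ortiz', 25.0, 'confidential');
    -- A view: a logical table exposing only what a payroll clerk may see.
    CREATE VIEW payroll_view AS
        SELECT emp_no, name, pay_rate FROM employee;
""")

# Information: a value is reached by table name, column name, and primary key.
rate = conn.execute(
    "SELECT pay_rate FROM employee WHERE emp_no = ?", (2,)
).fetchone()[0]

# Relational language: one statement changes the whole table, not row by row.
conn.execute("UPDATE employee SET pay_rate = pay_rate * 1.5")

# Views: queried like a base table, and current because no data are copied.
cursor = conn.execute("SELECT * FROM payroll_view ORDER BY emp_no")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

The view shows the updated pay rates without storing them a second time, and the CHECK rule is enforced by the database rather than by each application.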

178 ADVANTAGES OF THE RELATIONAL APPROACH 

179 The relational approach is based on tables of data in rows and columns, with operations 

180 defined on those tables. Yet these tables must possess the four characteristics of the relational 

181 approach described above. The RDBMS (not the user) must ensure that all database tables comply 

182 with these requirements. When they do, the RDBMS is able to apply mathematical operations and 

183 strict logic to them, which eliminates traditional deficiencies of DBMSs and offers significant 

184 practical benefits. The table structure of an RDBMS is simple and familiar. It is general enough 

185 to represent most types of data, is independent of any internal computer mechanisms, and is 

186 flexible because the user can restructure tables. Transaction processing is slower than with other 

187 approaches, but modifying the structure of files and adding data items (columns) is considerably 

188 easier. Also, the relational approach allows relationships between tables to be created later, after 

189 the data tables have been developed and the data entered. In the hierarchical and network 

190 approaches, allowable queries about the data have to be identified before the database is developed 

191 so that the pointers between files and records can be created along with the database. 
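
Adding a column and relating tables after the data have been entered can be sketched as follows. The example assumes Python's built-in sqlite3 module as a stand-in RDBMS, and the table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: the customer table is designed and loaded with no thought of regions.
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Acme Supply"), (2, "Baker Tools")],
)

# Step 2 (later): a new data item (column) is added to the existing table.
conn.execute("ALTER TABLE customer ADD COLUMN region TEXT")
conn.execute("UPDATE customer SET region = 'East' WHERE cust_no = 1")
conn.execute("UPDATE customer SET region = 'West' WHERE cust_no = 2")

# Step 3 (later still): a new table is created and related to the old one
# purely by matching values; no pointers were built in advance.
conn.execute("CREATE TABLE region_manager (region TEXT PRIMARY KEY, manager TEXT)")
conn.executemany(
    "INSERT INTO region_manager VALUES (?, ?)",
    [("East", "Kim"), ("West", "Nguyen")],
)

pairs = conn.execute(
    "SELECT c.name, m.manager "
    "FROM customer c JOIN region_manager m ON m.region = c.region "
    "ORDER BY c.cust_no"
).fetchall()
```

The final query was not planned when the customer table was first built, yet it needs no restructuring of the existing data.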

192 Data manipulation by an RDBMS is managed by a well-defined, complete set of 

193 mathematical operations, which always yield tables as results. With relational operations, data 
194 access no longer needs to be procedural. The user can specify a data request by giving the 

195 operations that must be performed on other tables to derive it. The system translates these requests 

196 into sets of efficient processing steps. A relational DBMS can accumulate information about the 

197 database (such as statistics) in a catalog 

198 to optimize these operations. 
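
The idea that relational operations always yield tables can be sketched with three small Python functions. This is a simplified illustration, not the algebra as any particular RDBMS implements it: tables are assumed to be lists of dictionaries, and the function names and sample data are hypothetical.

```python
def select(table, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in table:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def join(left, right, column):
    """Pair rows from two tables that share the same value in one column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


customers = [
    {"cust_no": 1, "name": "Acme Supply"},
    {"cust_no": 2, "name": "Baker Tools"},
]
invoices = [
    {"invoice_no": 5001, "cust_no": 1, "amount": 250.0},
    {"invoice_no": 5002, "cust_no": 2, "amount": 75.5},
    {"invoice_no": 5003, "cust_no": 1, "amount": 120.0},
]

# "Names of customers with an invoice over 100", stated as operations on tables.
big_spenders = project(
    select(join(customers, invoices, "cust_no"), lambda r: r["amount"] > 100),
    ["name"],
)
```

Because each operation returns a table, the output of one can be fed directly into the next. A real RDBMS is free to reorder such steps for efficiency, which this sketch does not attempt.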

199 SYSTEMS AND TRENDS 

200 (Photograph Omitted) Captioned as: Relational databases translate requests for 

201 information into logical processing steps. 

202 Leading relational programs for mainframes are DB2 for the MVS operating system and 

203 SQL/DS for the VM/CMS operating system. An SQL/DS application, for example, can have up 

204 to 70 million rows, hundreds of tables, and thousands of columns. SQL/DS is installed on more 

205 than 7,500 mainframes and costs more than $100,000 per installation. Other leading relational 

206 databases are Oracle and INGRES for minicomputers and dBASE and Paradox for microcomputers. 

207 The mathematical and logical basis of the relational approach makes it a natural candidate 

208 for a database standard. A standard based on the relational model would yield the best of all 

209 worlds: The products that complied would offer both relational features and compatibility with 

210 a defined standard, and the underlying database functions would be the same for all products, 

211 regardless of whether they are designed for a single user on a PC or multiple users in more 

212 sophisticated systems. In addition, tools such as spreadsheets and word processors do operate on 

213 some of these databases. Both the American National Standards Institute and the International 

214 Standards Organization have developed standards, and all DBMSs are moving toward them. 

215 RDBMSs are moving toward support of distributed databases, which are databases spread 

216 throughout the computer systems in a network. One benefit of a distributed database is that local 

217 data can be retrieved without any network activity, thus reducing communications costs when 

218 compared with a centralized database at a remote site. Another potential advantage is that each 

219 database can be sized appropriately for its amount of data, the complexity of user requirements, 

220 and the number of users. As the system grows, added demand can be met more easily than with 

221 a centralized system by making smaller changes to existing databases or by adding new databases 

222 to a network. Current RDBMSs deliver these benefits by allowing a collection of database 

223 operations (called a unit of work) to retrieve and update data at a remote site. Future capabilities 

224 will add support for a distributed unit of work, which allows a user to access data at multiple 

225 locations simultaneously. 
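
A unit of work can be sketched as a group of operations that either all take effect or all are undone. The example below is a local illustration only: it assumes Python's built-in sqlite3 module and a hypothetical account table, and it does not show access to a remote site.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "acct_no INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, from_acct, to_acct, amount):
    """Apply both updates as one unit of work; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acct_no = ?",
                (amount, to_acct),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acct_no = ?",
                (amount, from_acct),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dictionary keyed by account number."""
    return dict(conn.execute("SELECT acct_no, balance FROM account").fetchall())


ok_small = transfer(conn, 1, 2, 30.0)
after_small = balances(conn)
ok_large = transfer(conn, 1, 2, 500.0)   # would overdraw account 1
after_large = balances(conn)
```

In the second transfer the credit to account 2 succeeds first, but it is undone when the debit violates the balance rule, so neither half of the unit of work survives.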

226 RDBMSs are moving toward providing access to the data for applications running on 

227 remote computers. This style of distributed computing is called client/server, where the computer 

228 providing access to the data is called the database server, and the remote computer requesting the 

229 data is called the client. 

230 In client/server, one branch of a company in one location may have primary contact and 

231 conduct virtually all transactions with one segment of the company's customers, while other 

232 branches work with other customers. It is more efficient from the standpoint of data storage, 

233 transaction processing, and data communications if each branch maintains the data files for its 
234 customers while allowing other branches access to the data. This approach is called distributed 

235 data processing because the databases are distributed around the operational locations of the firm. 

236 If the databases in different branches or divisions are not connected and are kept separate, the 

237 company is operating with a decentralized data processing system. 

238 There are four goals for a distributed database system: 

239 A distributed system should appear to each user to be a single, nondistributed system so 

240 that queries and transactions that affect distributed data look no different from local queries and 

241 transactions. 

242 Each location should have local autonomy and not require the approval of some centralized 

243 group for local changes. 

244 A central site should not be required for data storage or processing. 

245 Operation should be continuous. 

246 To achieve these goals, there are several features of a distributed DBMS that should be 

247 transparent and of no concern to users. Table 2 summarizes them. 
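
The first goal, that distributed data should look no different from local data, can be imitated on one machine. This sketch assumes Python's built-in sqlite3 module and uses a second attached database to stand in for another branch; the schema name west, the customer tables, and the find_customer function are hypothetical, and no real network is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                   # stands in for the local branch
conn.execute("ATTACH DATABASE ':memory:' AS west")   # stands in for another branch

conn.execute("CREATE TABLE main.customer (cust_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE west.customer (cust_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO main.customer VALUES (1, 'Acme Supply')")
conn.execute("INSERT INTO west.customer VALUES (7, 'Pacific Parts')")


def find_customer(conn, cust_no):
    """Look up a customer without the caller knowing which branch holds the row."""
    row = conn.execute(
        "SELECT name FROM main.customer WHERE cust_no = ? "
        "UNION ALL "
        "SELECT name FROM west.customer WHERE cust_no = ?",
        (cust_no, cust_no),
    ).fetchone()
    return row[0] if row else None


local_name = find_customer(conn, 1)
remote_name = find_customer(conn, 7)
missing_name = find_customer(conn, 99)
```

The caller uses the same function for both customers. In a true distributed DBMS this hiding of location would be done by the system itself rather than by application code.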

248 As accounting data processing moves away from centralized mainframe processing, it 

249 moves toward either decentralized processing, with totally separate databases, or distributed 

250 database systems. Proof of the flexibility of the RDBMS is its ability to adjust to the newer, much 

251 more complicated ways of dealing with data. 

252 (Table Omitted) Captioned as: Table 2. 